Search for: All records

Creators/Authors contains: "Jia, Haiyan"


  1. The increasing availability of data search tools brings opportunities for non-expert users. Among these users, interdisciplinary researchers and data journalists represent a growing population whose work can lead to societal benefit. Through in-depth interviews, we examine what strategies and approaches researchers and journalists adopt to search online data, how they apply current technology to facilitate dataset search, and the barriers and difficulties that they encounter in their work with data. Our findings reveal that, because of technological limitations in searchability, interactivity and usability, dataset search remains a challenge for non-experts. We have found that little attention has been paid to non-experts’ emerging data needs, significantly constraining the design and development of technological tools for supporting non-expert users. Our findings underline the critical impact of the design, development and deployment of technological tools to enable the meaningful use of today’s increasingly available data toward a civil society.
  2. Alonso, Omar ; Marchesin, Stefano ; Najork, Mark ; Silvello, Gianmaria (Ed.)
    We present a novel approach to dataset search and exploration. Cell-centric indexing is a unique indexing strategy that enables a powerful, new interface. The strategy treats individual cells of a table as the indexed unit, and combining this with a number of structure-specific fields enables queries that cannot be answered by a traditional indexing approach. Our interface provides users with an overview of a dataset repository, and allows them to efficiently use various facets to explore the collection and identify datasets that match their interests. 
  3. Norwick, Katja (Ed.)
    Abstract In plants, miRNA production is orchestrated by a suite of proteins that control transcription of the pri-miRNA gene, post-transcriptional processing and nuclear export of the mature miRNA. Post-transcriptional processing of miRNAs is controlled by a pair of physically interacting proteins, hyponastic leaves 1 (HYL1) and Dicer-like 1 (DCL1). However, the evolutionary history and structural basis of the HYL1–DCL1 interaction are unknown. Here we use ancestral sequence reconstruction and functional characterization of ancestral HYL1 in vitro and in Arabidopsis thaliana to better understand the origin and evolution of the HYL1–DCL1 interaction and its impact on miRNA production and plant development. We found the ancestral plant HYL1 evolved high affinity for both double-stranded RNA (dsRNA) and its DCL1 partner before the divergence of mosses from seed plants (∼500 Ma), and these high-affinity interactions remained largely conserved throughout plant evolutionary history. Structural modeling and molecular binding experiments suggest that the second of two dsRNA-binding motifs (DSRMs) in HYL1 may interact tightly with the first of two C-terminal DCL1 DSRMs to mediate the HYL1–DCL1 physical interaction necessary for efficient miRNA production. Transgenic expression of the nearly 200 Ma-old ancestral flowering-plant HYL1 in A. thaliana was sufficient to rescue many key aspects of plant development disrupted by HYL1− knockout and restored near-native miRNA production, suggesting that the functional partnership of HYL1–DCL1 originated very early in and was strongly conserved throughout the evolutionary history of terrestrial plants. Overall, our results are consistent with a model in which miRNA-based gene regulation evolved as part of a conserved plant “developmental toolkit.”
  4. Abstract Functional and architectural diversification of transcription factor families has played a central role in the independent evolution of complex development in plants and animals. Here, we investigate the role of architectural constraints on evolution of B3 DNA binding domains that regulate plant embryogenesis. B3 domains of ABI3, FUS3, LEC2 and VAL1 proteins recognize the same cis-element. Complex architectures of ABI3 and VAL1 integrate cis-element recognition with other signals, whereas LEC2 and FUS3 have reduced architectures conducive to roles as pioneer activators. In yeast and plant in vivo assays, B3 domain functions correlate with architectural complexity of the parent transcription factor rather than phylogenetic relatedness. In a complex architecture, attenuated ABI3-B3 and VAL1-B3 activities enable integration of cis-element recognition with hormone signaling, whereas hyper-active LEC2-B3 and FUS3-B3 over-ride hormonal control. Three clade-specific amino acid substitutions (β4-triad) implicated in interactions with the DNA backbone account for divergence of LEC2-B3 and ABI3-B3. We find a striking correlation between differences in in vitro DNA binding affinity and in vivo activities of B3 domains in plants and yeast. Our results highlight the role of DNA backbone interactions that preserve DNA sequence specificity in adaptation of B3 domains to functional constraints associated with domain architecture.
  5. Increasingly, large collections of datasets are made available to the public via the Web, ranging from government-curated datasets like those of data.gov to communally-sourced datasets such as Wikipedia tables. It has become clear that traditional search techniques are insufficient for such sources, especially when the user is unfamiliar with the terminology used by the creators of the relevant datasets. We propose to address this problem by elevating the datum to a first-class object that is indexed, thereby making it less dependent on how a dataset is structured. In a data table, a cell contains a value for a particular row as described by a particular column. In our cell-centric indexing approach, we index the metadata of each cell, so that information about its dataset and column simply becomes metadata rather than constraining concepts. In this paper we define cell-centric indexing and present a system architecture that supports its use in exploring datasets. We describe how cell-centric indexing can be implemented using traditional information retrieval technology and evaluate the scalability of the architecture.
  6. Abstract

    Vernalization genes underlying dramatic differences in flowering time between spring wheat and winter wheat have been studied extensively, but little is known about genes that regulate subtler differences in flowering time among winter wheat cultivars, which account for approximately 75% of wheat grown worldwide. Here, we identify a gene encoding an O-linked N-acetylglucosamine (O-GlcNAc) transferase (OGT) that differentiates heading date between winter wheat cultivars Duster and Billings. We clone this TaOGT1 gene from a quantitative trait locus (QTL) for heading date in a mapping population derived from these two bread wheat cultivars and analyzed in various environments. Transgenic complementation analysis shows that constitutive overexpression of TaOGT1b from Billings accelerates the heading of transgenic Duster plants. TaOGT1 is able to transfer an O-GlcNAc group to wheat protein TaGRP2. Our findings establish important roles for TaOGT1 in winter wheat in adaptation to global warming in the future climate scenarios.

     
  7. A search engine's ability to retrieve desirable datasets is important for data sharing and reuse. Existing dataset search engines typically rely on matching queries to dataset descriptions. However, a user may not have enough prior knowledge to write a query using terms that match with description text. We propose a novel schema label generation model which generates possible schema labels based on dataset table content. We incorporate the generated schema labels into a mixed ranking model which not only considers the relevance between the query and dataset metadata but also the similarity between the query and generated schema labels. To evaluate our method on real-world datasets, we create a new benchmark specifically for the dataset retrieval task. Experiments show that our approach can effectively improve the precision and NDCG scores of the dataset retrieval task compared with baseline methods. We also test on a collection of Wikipedia tables to show that the features generated from schema labels can improve the unsupervised and supervised web table retrieval task as well. 
  8. In animals, endocytosis of a seven-transmembrane GPCR is mediated by arrestins to propagate or arrest cytoplasmic G protein–mediated signaling, depending on the bias of the receptor or ligand, which determines how much one transduction pathway is used compared to another. In Arabidopsis thaliana, GPCRs are not required for G protein–coupled signaling because the heterotrimeric G protein complex spontaneously exchanges nucleotide. Instead, the seven-transmembrane protein AtRGS1 modulates G protein signaling through ligand-dependent endocytosis, which initiates derepression of signaling without the involvement of canonical arrestins. Here, we found that endocytosis of AtRGS1 initiated from two separate pools of plasma membrane: sterol-dependent domains and a clathrin-accessible neighborhood, each with a select set of discriminators, activators, and candidate arrestin-like adaptors. Ligand identity (either the pathogen-associated molecular pattern flg22 or the sugar glucose) determined the origin of AtRGS1 endocytosis. Different trafficking origins and trajectories led to different cellular outcomes. Thus, in this system, compartmentation with its associated signalosome architecture drives biased signaling.

     
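
The cell-centric indexing idea summarized in records 2 and 5 above (each table cell is the indexed unit, with its dataset and column carried along as metadata fields) can be illustrated with a toy in-memory inverted index. This is only a minimal sketch under stated assumptions, not the authors' system: the names `Cell`, `CellIndex`, `add_table`, `search`, and `datasets`, the whitespace tokenizer, and the sample tables are all hypothetical, and a real deployment would presumably sit on a full information retrieval engine.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """One table cell plus the structural context it came from (hypothetical model)."""
    dataset: str
    column: str
    row: int
    value: str


def tokenize(text):
    """Lowercase and split on whitespace; a stand-in for a real text analyzer."""
    return text.lower().split()


class CellIndex:
    """Toy inverted index whose indexed unit is a single cell, not a whole dataset."""

    def __init__(self):
        # Maps (field, token) to the set of cells containing that token in that field.
        self.postings = defaultdict(set)

    def add_table(self, dataset, columns, rows):
        """Index every cell of a table under its value, column, and dataset fields."""
        for r, row in enumerate(rows):
            for column, value in zip(columns, row):
                cell = Cell(dataset, column, r, str(value))
                fields = (("value", cell.value), ("column", column), ("dataset", dataset))
                for field, text in fields:
                    for token in tokenize(text):
                        self.postings[(field, token)].add(cell)

    def search(self, **terms):
        """Return the cells matching every field=token pair (single tokens only)."""
        result = None
        for field, term in terms.items():
            hits = set(self.postings.get((field, term.lower()), set()))
            result = hits if result is None else result & hits
        return result or set()

    def datasets(self, **terms):
        """Roll matching cells up to the sorted names of the datasets they belong to."""
        return sorted({cell.dataset for cell in self.search(**terms)})


# Hypothetical sample data, invented for illustration only.
index = CellIndex()
index.add_table("city_budget", ["department", "amount"],
                [["Parks", 1200], ["Police", 5400]])
index.add_table("park_visits", ["park", "visitors"], [["Riverside", 300]])

print(index.datasets(value="parks"))
print(index.datasets(column="park"))
```

Because dataset and column names are just additional fields on each cell, the same lookup can match a term in a cell value or in a column header without the query needing to know how any particular table is laid out.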